Abstract. The Bayesian approach is becoming increasingly important as 
it provides many advantages in dealing with complex data. However, 
there is no well-defined model selection criterion or index in a Bayesian 
context. To address the challenges, new indices are needed. The goal of 
this study is to propose new model selection indices and to investigate 
their performances in the framework of latent growth mixture models 
with missing data and outliers in a Bayesian context. We consider 
latent growth models because they are very flexible in modeling complex 
data and are becoming increasingly popular in statistical, psychological, 
behavioral, and educational areas. Specifically, this study conducted five 
simulation studies to cover different cases, including latent growth curve 
models with missing data, latent growth curve models with missing data 
and outliers, growth mixture models with missing data and outliers, 
extended growth mixture models with missing data and outliers, and 
latent growth models with different classes. Simulation results show that 
almost all proposed indices can effectively identify the true model. This 
study also illustrated the application of these model selection indices in 
real data analysis. 
1 Introduction 


The Bayesian approach is becoming increasingly important in estimating models as 
it provides many advantages in dealing with complex data (e.g., Dunson, 2000). 
However, there is no well-defined model selection criterion or index in a Bayesian 
context (e.g., Celeux, Forbes, Robert, & Titterington, 2006). This is due to at least 
three problems. First, in a Bayesian context there are two versions of deviance 
because the Bayesian procedure generates a Markov chain of Monte Carlo draws 
for each parameter. One version is the deviance evaluated at a point estimate of 
the parameters, that is, a function of the parameter estimates. The other version 
is the Monte Carlo estimate of the expected deviance, which can be estimated as 
the posterior mean of the deviance over the iterations of a converged Markov 
chain. In short, the former is the deviance of the averaged estimates, and the 
latter is the average of the deviances across all iterations. The second problem 
is related to the complexity of the raw data. The data often come from 
heterogeneous populations and almost unavoidably contain outliers and missing 
values. Estimates from mis-specified models may result in severely misleading 
conclusions. The third problem relates to the likelihood function. When latent 
variables are considered in statistical models, the likelihood function can be an 
observed-data likelihood function, a complete-data likelihood function, or a 
conditional likelihood function (Celeux et al., 2006). Furthermore, if data come 
from heterogeneous populations, the class membership indicator may have 
different versions, for example, a posterior mode or a posterior mean. Also, with 
missing data, the likelihood function can be constructed in different ways. 


1.1 Model Selection Criteria/Indices 


Traditional model selection criteria or indices are available for researchers who 
try to select the best-fit model from a large set of candidate models. Akaike 
proposed the Akaike information criterion (AIC), which offers a relative 
measure of the information lost. For Bayesian models, the Bayes factor, which is 
the ratio of posterior odds to prior odds, can be used for both hypothesis testing 
and model comparison. But the Bayes factor is often difficult or impossible 
to calculate, especially for models that involve random effects, large numbers 
of unknowns, or improper priors. To approximate the Bayes factor, Schwarz 
(1978) developed the Bayesian information criterion (BIC, sometimes called the 
Schwarz criterion). To obtain more precise indices, Bozdogan (1987) proposed 
the consistent Akaike information criterion (CAIC), and Sclove (1987) proposed 
the sample-size adjusted Bayesian information criterion (ssBIC). The deviance 
information criterion (DIC; Spiegelhalter, Best, Carlin, & van der Linde, 2002) is a 
recently developed criterion designed for hierarchical models. It is based on 
the posterior distribution of the log-likelihood and is useful in Bayesian model 
selection problems where the posterior distributions have been obtained by 
Markov chain Monte Carlo (MCMC) simulation. DIC is usually regarded as 
a generalization of AIC and BIC. It is defined analogously to AIC or BIC, 
with a penalty term equal to the effective number of model parameters in 
Bayesian models. In practice, rough DIC (RDIC, or DICV in some literature; 
e.g., Oldmeadow & Keith, 2011) is an approximation of DIC. The mathematical 
forms of AIC, BIC, CAIC, ssBIC, and DIC are closely related to each other. 
They all try to find a balance between the accuracy and the complexity of the 
fitted model. For all indices above, the model with the smaller criterion/index 
value is better supported by the data. 
Lu, Zhang, and Cohen (2013) proposed a series of Bayesian model selection 
indices based on the traditional ones. However, in Lu et al. (2013) the 
performance of these indices was investigated only when data were non-mixture, 
normally distributed, and subject to simple non-ignorable missingness, and only 
latent growth models were used. 


1.2 Goals and Structure 


To address the challenges in model selection criterion/index in a Bayesian 
context, this paper proposes ten model selection indices. This paper also 
examines the performance of these indices under various conditions by 
conducting five simulation studies to cover different latent growth models, 
such as the robust growth models for non-normally distributed data, robust 
growth mixture models, and the extended robust growth mixture models with 
missing values. We consider latent growth models because they are very flexible 
in modeling complex data and are becoming increasingly popular in statistical, 
psychological, behavioral, and educational areas. 

The rest of the article consists of five sections. Section 2 presents and 
formulates three types of models we used in this paper: latent growth models 
(including growth curve models, growth mixture models, and extended growth 
mixture models), robust growth models (including three types of robust 
models), and models that account for missingness (we focus on non-ignorable 
missingness). Section 3 proposes ten model selection indices in the framework of 
Bayesian growth models with missing data. Section 4 conducts five simulation 
studies to evaluate the performance of the Bayesian indices. Model selection 
results are analyzed, summarized, and compared. Section 5 illustrates the 
application of these model selection indices in real data analysis. Section 6 
discusses the implications and future directions of this study. 


2 Latent Growth Models, Robust Growth Models, and 
Missing Values 


Our investigation of the performance of the Bayesian selection indices involves 
fitting growth models to complex data. In this section, different types of growth 
models are briefly introduced. Given that the data used in growth models 
almost inevitably contain attrition (e.g., Little & Rubin) and outliers (e.g., 
Maronna, Martin, & Yohai, 2006), different types of growth models have been 
developed, which include traditional latent growth curve models with missing 
data (Lu, Zhang, & Cohen, 2013), robust growth curve models (Zhang, Lai, Lu, 
& Tong, 2013) with missing data (Lu & Zhang, 2021), growth mixture models 
(e.g., Bartholomew & Knott) with missing data (Lu & Zhang, 2014), extended 
growth mixture models (EGMMs; Muthén & Shedden, 1999) with missing data 
(Lu & Zhang, 2014), and robust growth mixture models with missing data 
(Lu & Zhang, 2014). 

In the following, we discuss three types of models: latent growth models 
(including growth curve models, growth mixture models, and extended growth 
mixture models), robust growth models (including three types of robust 
models), and models that account for missingness (we focus on non-ignorable 
missingness). By combining different elements of these models, it becomes 
possible to consider a series of growth models with a variety of missing data 
mechanisms and contaminated data. 


2.1 Latent Growth Models 


The mathematical form of a latent growth curve model is 


$$
\begin{aligned}
y_i &= \Lambda \eta_i + e_i, \\
\eta_i &= \beta + \xi_i,
\end{aligned} \tag{1}
$$


where $y_i$ is a $T \times 1$ vector of outcomes for participant $i$ ($i = 1, \ldots, N$), $\eta_i$ is a 
$q \times 1$ vector of latent effects, $\Lambda$ is a $T \times q$ matrix of factor loadings for $\eta_i$, $e_i$ is a 
$T \times 1$ vector of residual or measurement errors, $\beta$ is a $q \times 1$ vector of fixed effects, 
and $\xi_i$ captures the variation of $\eta_i$. Note that $e_i$ and $\xi_i$ are usually 
assumed to be normally distributed, but this is not necessary. When data have 
outliers and are heavy-tailed, the normality assumption might bias the estimates. 
To reduce the effects of outliers, we consider robust models in this study. 
A growth mixture model can be expressed as 


$$
f(y_i) = \sum_{k=1}^{K} \pi_k f_k(y_i), \tag{2}
$$


where $\pi_k$ is the invariant class probability (or weight) for class $k$ satisfying 
$0 \le \pi_k \le 1$ and $\sum_{k=1}^{K} \pi_k = 1$, and $f_k(y_i)$ ($k = 1, \ldots, K$) is the density of a 
latent growth model for class $k$. 

For extended growth mixture models (EGMMs; Muthén & Shedden, 1999), 
$\pi_k$ is not invariant across individuals. It is allowed to vary individually depending 
on covariates, so it is expressed as $\pi_{ik}(x_i)$. If a probit link function is used, then 


$$
\begin{aligned}
\pi_{i1}(x_i) &= \Phi(X_i'\varphi_1), \\
\pi_{ik}(x_i) &= \Phi(X_i'\varphi_k) - \Phi(X_i'\varphi_{k-1}), \quad (k = 2, 3, \ldots, K-1), \\
\pi_{iK}(x_i) &= 1 - \Phi(X_i'\varphi_{K-1}),
\end{aligned} \tag{3}
$$


where $\Phi(\cdot)$ is the cumulative distribution function (CDF) of the standard normal 
distribution, and $X_i = (1, x_i')'$ with an $r \times 1$ vector of observed covariates $x_i$. 
Note that $\Phi(X_i'\varphi_k) = \sum_{j=1}^{k} \pi_{ij}(x_i)$ and $\Phi(X_i'\varphi_K) = 1$. 

A dummy variable $z_i = (z_{i1}, z_{i2}, \ldots, z_{iK})'$ is used to indicate the class 
membership. If individual $i$ comes from group $k$, then $z_{ik} = 1$ and $z_{ij} = 0$ ($\forall j \ne k$). 
$z_i$ is multinomially distributed (McLachlan & Peel, 2000, p. 7), that is, 
$z_i \sim MultiNomial(\pi_{i1}, \pi_{i2}, \ldots, \pi_{iK})$. 
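
As an illustration of the probit-link class probabilities in Eq. (3), the following is a minimal Python sketch, not the authors' implementation. The function names, the three-class setup, and the coefficient values are hypothetical and chosen only so that the cumulative probits increase with k.

```python
import math


def std_normal_cdf(v):
    """CDF of the standard normal distribution, Phi(v)."""
    return 0.5 * (1.0 + math.erf(v / math.sqrt(2.0)))


def class_probabilities(x, phis):
    """Class probabilities pi_i1..pi_iK of Eq. (3) for one individual.

    x    : observed covariates x_i (length r)
    phis : K-1 coefficient vectors phi_1..phi_{K-1}, each of length r + 1
    """
    big_x = [1.0] + list(x)  # X_i = (1, x_i')'
    # Cumulative probabilities Phi(X_i' phi_k) for k = 1..K-1, then 1 for k = K
    cum = [std_normal_cdf(sum(a * b for a, b in zip(big_x, phi))) for phi in phis]
    cum.append(1.0)
    # Successive differences of the cumulative probabilities give pi_ik
    probs = [cum[0]] + [cum[k] - cum[k - 1] for k in range(1, len(cum))]
    if any(p < 0 for p in probs):
        raise ValueError("cumulative probits must be non-decreasing in k")
    return probs


# Hypothetical 3-class example with a single covariate x_i = 1
probs = class_probabilities(x=[1.0], phis=[[-1.0, 0.5], [0.2, 0.5]])
print([round(p, 3) for p in probs])
```

Because the probabilities are differences of cumulative probits, they sum to one by construction, which matches the note that the cumulative probability for the last class equals 1.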
2.2 Robust Growth Models 


When data have outliers and are heavy-tailed, robust methods are used to 
reduce the effects of outliers. As t-distributions are more robust than normal 
distributions, the following robust growth models are considered (Lu & Zhang, 
2021; Zhang et al., 2013). 

(1) t-Normal (TN) model, in which the measurement errors are t-distributed 
and the latent random effects are normally distributed, 


$$
\begin{aligned}
e_i &\sim Mt_T(0, \Theta, \nu), \\
\xi_i &\sim MN_q(0, \Psi),
\end{aligned} \tag{4}
$$


where $Mt_T(0, \Theta, \nu)$ is a $T$-dimensional multivariate t-distribution with a scale 
matrix $\Theta$ and degrees of freedom $\nu$, and $MN_q(0, \Psi)$ is a $q$-dimensional 
multivariate normal distribution with a covariance matrix $\Psi$. 

(2) Normal-t (NT) model, in which the measurement errors are normally 
distributed but the latent random effects are t-distributed, 


$$
\begin{aligned}
e_i &\sim MN_T(0, \Theta), \\
\xi_i &\sim Mt_q(0, \Psi, u).
\end{aligned} \tag{5}
$$


(3) t-t (TT) model, in which both the measurement errors and the latent 
random effects are t-distributed, 


$$
\begin{aligned}
e_i &\sim Mt_T(0, \Theta, \nu), \\
\xi_i &\sim Mt_q(0, \Psi, u).
\end{aligned} \tag{6}
$$
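
To make the distributional assumptions concrete, here is a minimal Python sketch of drawing one set of errors under the TN model. It is not the authors' simulation code: it assumes identity scale and covariance matrices, and it uses the standard normal scale-mixture representation of the multivariate t-distribution. The helper names and the seed are hypothetical.

```python
import math
import random


def draw_mv_t(dim, df, rng):
    """One draw from Mt_dim(0, I, df) via the normal scale-mixture form:
    e = z / sqrt(w / df), with z ~ MN(0, I) and w ~ chi-square(df)."""
    # chi-square(df) is Gamma(shape = df / 2, scale = 2)
    w = rng.gammavariate(df / 2.0, 2.0)
    scale = math.sqrt(w / df)
    # The same mixing variable w is shared by all components of the vector
    return [rng.gauss(0.0, 1.0) / scale for _ in range(dim)]


def draw_mv_normal(dim, rng):
    """One draw from MN_dim(0, I)."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


rng = random.Random(2024)  # arbitrary seed, for reproducibility only
T, q = 4, 2                # four waves; intercept and slope

e_i = draw_mv_t(T, df=5, rng=rng)   # t-distributed measurement errors
xi_i = draw_mv_normal(q, rng)       # normally distributed random effects
print(e_i, xi_i)
```

Swapping which of the two vectors is drawn with `draw_mv_t` gives the NT model, and drawing both with it gives the TT model.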


2.3 Missing Values 


We focus on non-ignorable missingness in this paper. To build models with 
non-ignorable missingness, selection models are used. For individual $i$, let 
$m_i = (m_{i1}, m_{i2}, \ldots, m_{iT})'$ be a missing data indicator for $y_i$, with 
$m_{it} = 1$ when $y_{it}$ is missing and 0 when it is observed. Let 
$\tau_{it} = p(m_{it} = 1)$ be the probability that $y_{it}$ is missing. Then 
$m_{it} \sim Bernoulli(\tau_{it})$, so its density function is 
$f(m_{it}) = \tau_{it}^{m_{it}} (1 - \tau_{it})^{(1 - m_{it})}$. The missingness 
probability $\tau_{it}$ can have different forms. The following non-ignorable 
missingness mechanisms have been proposed for mixture models. 

(1) Latent-Class-Intercept-Dependent (LCID) missingness, in which $\tau_{it}$ is 
a function of latent class, covariates, and latent individual initial levels. For 
example, students are more likely to miss a test if their starting levels in that 
course are low. We model it as follows: 


$$
\tau_{it} = \Phi(z_i'\gamma_{zt} + I_i\gamma_{It} + x_i'\gamma_{xt}), \tag{7}
$$


where $I_i$ is the latent initial level for individual $i$, $\gamma_{It}$ is the coefficient for 
$I_i$, $\gamma_{zt}$ are the coefficients for class membership, and $\gamma_{xt}$ are the coefficients 
for covariates. For non-mixture homogeneous growth models, LCID can be simplified 
to Latent-Intercept-Dependent (LID) missingness without the class membership 
indicator $z_i$ and expressed as $\tau_{it} = \Phi(\gamma_{0t} + I_i\gamma_{It} + x_i'\gamma_{xt})$, where $\gamma_{0t}$ 
is the intercept. 

(2) Latent-Class-Slope-Dependent (LCSD) missingness, in which $\tau_{it}$ is a 
function of latent class, covariates, and latent individual slopes of growth. For 
example, students are more likely to miss a test if they show slow growth in the 
course. In this case, $\tau_{it}$ can be modeled as 


$$
\tau_{it} = \Phi(z_i'\gamma_{zt} + S_i\gamma_{St} + x_i'\gamma_{xt}), \tag{8}
$$


where $S_i$ is the latent slope for individual $i$, and $\gamma_{St}$ is the coefficient for $S_i$. 
Similarly, for non-mixture homogeneous growth models, LCSD is simplified to the 
Latent-Slope-Dependent (LSD) case as $\tau_{it} = \Phi(\gamma_{0t} + S_i\gamma_{St} + x_i'\gamma_{xt})$. 

(3) Latent-Class-Outcome-Dependent (LCOD) missingness, in which $\tau_{it}$ is a 
function of latent class, covariates, and potential outcomes that may be missing. 
For example, a student who feels he is not doing well on the test may be more 
likely to give up taking the rest of the test. We express $\tau_{it}$ as 


$$
\tau_{it} = \Phi(z_i'\gamma_{zt} + y_{it}\gamma_{yt} + x_i'\gamma_{xt}), \tag{9}
$$


where $y_{it}$ is the potential outcome for individual $i$ at time $t$, and $\gamma_{yt}$ is the 
coefficient for $y_{it}$. LCOD can be simplified to Latent-Outcome-Dependent 
(LOD) missingness for non-mixture homogeneous growth models, with a probability 
of missingness $\tau_{it} = \Phi(\gamma_{0t} + y_{it}\gamma_{yt} + x_i'\gamma_{xt})$. 

In a more general framework, LCID and LCSD can be further encompassed 
into Latent-Class-Random-Effect-Dependent missingness, as the intercept and 
slope are different random effects according to the situation under 
consideration. For non-mixture structures, LID and LSD are encompassed 
into Latent-Random-Effect-Dependent missingness. 


3 Bayesian Model Selection Indices 


In this section, we propose ten model selection criteria in the framework of 
Bayesian growth models with missing data. The definitions of the selection 
criteria are listed in Table 1. The model selection criteria in the table are 
based on two versions of deviance in the Bayesian context, $E_{\theta|y}[D(\theta)]$ and 
$D(E_{\theta|y}[\theta])$. As discussed in the introduction, $E_{\theta|y}[D(\theta)]$ is the 
expected value of all the deviances, and $D(E_{\theta|y}[\theta])$ is the deviance evaluated at 
the expected parameters. For different models, the detailed mathematical forms 
of these two deviances are different. In this paper, we focus on both homogeneous 
and heterogeneous latent growth models with non-ignorable missing data. 

We first look at the homogeneous growth curve models with non-ignorable 
missing data. One version of deviance, $E_{\theta|y}[D(\theta)]$, is approximated by 


$$
E_{\theta|y}[D(\theta)] \approx \overline{D(\theta)} = -\frac{2}{S} \sum_{s=1}^{S} \sum_{i=1}^{N} \sum_{t=1}^{T} l_{it}^{(s)}(\theta|y, m), \tag{10}
$$
Table 1. Model Selection Indices 


Index        =  Deviance                +  Penalty 
Dbar.AIC        $\overline{D(\theta)}$     2p 
Dbar.BIC        $\overline{D(\theta)}$     log(N) p 
Dbar.CAIC       $\overline{D(\theta)}$     (log(N)+1) p 
Dbar.ssBIC      $\overline{D(\theta)}$     log((N+2)/24) p 
RDIC            $\overline{D(\theta)}$     var(Dbar)/2 
Dhat.AIC        $D(\hat{\theta})$          2p 
Dhat.BIC        $D(\hat{\theta})$          log(N) p 
Dhat.CAIC       $D(\hat{\theta})$          (log(N)+1) p 
Dhat.ssBIC      $D(\hat{\theta})$          log((N+2)/24) p 
DIC             $D(\hat{\theta})$          2pD 


Note. 
1. p is the number of parameters. 
2. N is the sample size. 
3. pD = $\overline{D(\theta)} - D(\hat{\theta})$. 
4. $\overline{D(\theta)}$ is shown as in eqn. (10) for growth curve models and as in eqn. (13) for 
growth mixture models. 
5. $D(\hat{\theta})$ is shown as in eqn. (12) for growth curve models and as in eqn. (14) for 
growth mixture models. 


where $S$ is the number of iterations for converged Markov chains, 
$l_{it}^{(s)}(\theta|y, m) = \log(L_{it}^{(s)}(\theta|y, m))$ is a conditional joint loglikelihood 
function (see Celeux et al., 2006) of $y$ and $m$, and $m_{it}$ is the missing data indicator 
for individual $i$ at time $t$ with a loglikelihood function 
$l_{it}(m) = m_{it}\log(\tau_{it}) + (1 - m_{it})\log(1 - \tau_{it})$, where 
$\tau_{it}$ is the missing data rate for individual $i$ at time $t$ and is defined differently 
for different missingness models as in the previous section. When $y_{it}$ is missing, 
the corresponding likelihood is excluded. So, combining $y$ and $m$, the conditional 
likelihood function of a selection model with non-ignorable missing data can be 
expressed as 


$$
L_{it}(\theta|y, m) = [L_{it}(y)]^{(1 - m_{it})} \times L_{it}(m). \tag{11}
$$


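The other version of deviance, $D(E_{\theta|y}[\theta])$, is approximated by 


$$
D(E_{\theta|y}[\theta]) \approx D(\hat{\theta}) = -2 \sum_{i=1}^{N} \sum_{t=1}^{T} \left[ (1 - m_{it})\, \hat{l}_{it}(y) + \hat{l}_{it}(m) \right], \tag{12}
$$


where $\hat{\theta}$ is the posterior mean of the parameter estimates across $S$ iterations, 
and $\hat{l}_{it}(\cdot)$ denotes the corresponding loglikelihood evaluated at $\hat{\theta}$. 
For growth mixture models with missing data, $E_{\theta|y}[D(\theta)]$ is expressed as 


$$
E_{\theta|y}[D(\theta)] \approx \overline{D(\theta)} = -\frac{2}{S} \sum_{s=1}^{S} \sum_{i=1}^{N} \sum_{k=1}^{K} z_{ik}^{(s)} \sum_{t=1}^{T} \left[ (1 - m_{it})\, l_{ikt}^{(s)}(y) + l_{ikt}^{(s)}(m) \right], \tag{13}
$$


where $z_i = (z_{i1}, z_{i2}, \ldots, z_{iK})'$ is the class membership indicator, which follows a 
multinomial distribution, $z_i \sim MultiNomial(\pi_{i1}, \pi_{i2}, \ldots, \pi_{iK})$, and $z_i^{(s)}$ is the 
class membership estimated at iteration $s$. And 


$$
D(E_{\theta|y}[\theta]) \approx D(\hat{\theta}) = -2 \sum_{i=1}^{N} \sum_{k=1}^{K} \hat{z}_{ik} \sum_{t=1}^{T} \left[ (1 - m_{it})\, \hat{l}_{ikt}(y) + \hat{l}_{ikt}(m) \right], \tag{14}
$$


where $\hat{z}_i$ is the posterior mode of class membership, and $\hat{\theta}$ is the posterior mean of 
the parameter estimates across all $S$ iterations. In both the $\overline{D(\theta)}$ and $D(\hat{\theta})$ definitions 
of deviance, $l_{ikt}(y)$ and $l_{ikt}(m)$ are the conditional loglikelihood functions for $y_{it}$ 
and $m_{it}$, respectively, for individual $i$ in class $k$ at time $t$. 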


The difference between $\overline{D(\theta)}$ and $D(\hat{\theta})$ can be quantified by a statistic called 
pD (Spiegelhalter et al., 2002), 


$$
pD = \overline{D(\theta)} - D(\hat{\theta}). \tag{15}
$$


Based on Jensen's inequality (Casella & George, 1992), when $D(\theta)$ is convex, 
$\overline{D(\theta)} \ge D(\hat{\theta})$ and, as a result, pD is non-negative. When $D(\theta)$ is concave, 
$\overline{D(\theta)} \le D(\hat{\theta})$ and pD is non-positive. 
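
The ten indices in Table 1 can be sketched as a small calculation once the two deviances are available. The Python code below is only an illustration under stated assumptions: the function name and the example numbers are hypothetical, the per-iteration deviances are assumed to have been computed already, and the RDIC penalty var(Dbar)/2 is read here as half the sample variance of the deviance draws.

```python
import math
from statistics import mean, variance


def selection_indices(deviance_draws, dhat, p, n):
    """Compute the ten indices of Table 1.

    deviance_draws : deviance D(theta^(s)) at each retained MCMC iteration
    dhat           : deviance evaluated at the posterior mean, D(theta-hat)
    p              : number of parameters
    n              : sample size N
    """
    dbar = mean(deviance_draws)   # Monte Carlo estimate of E[D(theta)]
    p_d = dbar - dhat             # pD, Eq. (15)
    penalties = {
        "AIC": 2 * p,
        "BIC": math.log(n) * p,
        "CAIC": (math.log(n) + 1) * p,
        "ssBIC": math.log((n + 2) / 24) * p,
    }
    indices = {}
    for name, penalty in penalties.items():
        indices["Dbar." + name] = dbar + penalty
        indices["Dhat." + name] = dhat + penalty
    # Assumption: var(Dbar) is taken as the sample variance of the deviance draws
    indices["RDIC"] = dbar + variance(deviance_draws) / 2
    indices["DIC"] = dhat + 2 * p_d
    return indices


# Hypothetical numbers, for illustration only
draws = [1010.0, 1002.0, 998.0, 1006.0, 1004.0]
idx = selection_indices(draws, dhat=1000.0, p=7, n=500)
for name in sorted(idx):
    print(name, round(idx[name], 3))
```

With these made-up inputs the mean deviance is 1004 and pD is 4, so DIC is 1008. For every index, the candidate model with the smaller value is preferred.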


4 Simulation Studies 


In this section, five simulation studies are conducted to evaluate the performance 
of the Bayesian indices. For each study, four waves of complete data are 
generated first and then missing data are created on each occasion according 
to pre-designed missing data rates. After data are generated, full Bayesian 
methods are used by adopting uninformative priors, obtaining conditional 
posterior distributions through application of a data augmentation algorithm, 
generating Markov chains through a Gibbs sampling procedure, conducting 
convergence testing, and making statistical inference for model parameters. For 
all simulations, the software OpenBUGS is used for the implementation of Gibbs 
sampling, and R is used for data-generation, convergence testing, and parameter 
estimation. 

The five studies are designed such that the data complexity increases from 
study 1 to study 5. Studies 1-2 focus on non-mixture growth data, and thus 
latent growth curve models with missing data are used. Studies 3-5 focus on 
mixture growth data, and thus growth mixture models with missing data are 
used. Simulation factors include measurement error distributions, random-effects 
distributions, missingness patterns, sample size, and class separation (Anderson 
& Bahadur, 1962). Under each condition, 100 converged replications are used to 
calculate the model selection proportion. Table 2 lists the design details. 
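
The model selection proportion can be sketched as follows: within each replication, the candidate model with the smallest index value is selected, and the proportion is the share of replications in which each model is selected. This Python sketch is illustrative only. The function name and the index values are hypothetical, and only four replications are shown instead of 100.

```python
from collections import Counter


def selection_proportions(index_values):
    """index_values: one dict per replication, mapping model name -> index value.

    Returns the proportion of replications in which each model has the
    smallest index value (smaller is better for all indices considered).
    """
    winners = Counter(min(rep, key=rep.get) for rep in index_values)
    n_rep = len(index_values)
    return {model: winners[model] / n_rep for model in index_values[0]}


# Hypothetical index values (e.g., one index across four replications)
replications = [
    {"NN-XS": 1510.2, "NN-XY": 1523.9, "NN-XI": 1530.1, "NN-ignorable": 1541.0},
    {"NN-XS": 1498.7, "NN-XY": 1502.3, "NN-XI": 1519.8, "NN-ignorable": 1533.4},
    {"NN-XS": 1521.5, "NN-XY": 1517.6, "NN-XI": 1534.0, "NN-ignorable": 1548.2},
    {"NN-XS": 1505.9, "NN-XY": 1514.4, "NN-XI": 1526.7, "NN-ignorable": 1539.9},
]
proportions = selection_proportions(replications)
print(proportions)
```

In this made-up example NN-XS is selected in three of the four replications and NN-XY in one.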
Table 2. Simulation Study Design 


Study 1. Normal LGCMs (relatively small sample sizes because the data are single-class) 
    Candidate models: NN-ignorable, NN-XI, NN-XS, NN-XY 
    True model:       NN-XS 

Study 2. Robust LGCMs (relatively small sample sizes because the data are single-class) 
    Candidate models: TN-, TT-, NT-, and NN-based models, each with ignorable, 
                      XI, XS, or XY missingness 
    True model:       TN-XS 

Study 3. Robust GMMs (RGMMs) (relatively large sample sizes because the data are 
         multiple-class; small class separation because class probabilities are fixed) 
    Candidate models: TN-, TT-, NT-, and NN-based models, each with ignorable, 
                      XI, XS, or XY missingness 
    True model:       2-class TN-XS 

Study 4. Robust Extended GMMs (REGMMs) (5 competing models selected based on the 
         performance in Study 3; relatively large sample sizes because the data are 
         multiple-class and class probabilities vary) 
    Candidate models: TN-CXS, TN-CX, TT-CXS, NN-CXS, NN-CX 
    True model:       2-class TN-CXS 

Study 5. Single-Class LGCMs vs. Multiple-Class RGMMs 
    1-class LGCMs:    TN-XS, TT-XS, NN-XS 
    2-class RGMMs:    TN-XS, TT-XS, NN-XS 
    3-class RGMMs:    TN-XS, TT-XS, NN-XS 
    4-class RGMMs:    TN-XS, TT-XS, NN-XS 
    True model:       2-class TN-XS 


Note. In each model name, the first letter gives the distribution of the 
measurement errors ($e_i$) and the second letter gives the distribution of the 
random effects ($\xi_i$): N = normal distribution, T = t distribution. The suffix 
indicates what the missingness depends on: C = latent class (non-ignorable), 
X = observed covariates, I = latent intercept (non-ignorable), S = latent slope 
(non-ignorable), Y = potential outcome y (non-ignorable). Class separation when 
generating data: S (small) = 1.7, M (medium) = 2.7. 

Study 1 investigated the performance of the Bayesian indices when data 
were non-mixture, homogeneous, and normally distributed with non-ignorable 
missingness. The true model was NN-XS, which was the model with normally 
distributed measurement errors ($e_i$) at level 1 and random effects ($\xi_i$) at level 
2, with missingness depending on the covariate $x$ and the latent slope $S$. Specifically, 
$e_i \sim MN_T(0, I)$ and $\eta_i \sim MN_q(\beta, \Psi)$, where $\beta$ = (Intercept, Slope)' = (1, 3)' and $\Psi$ 
was a 2 by 2 symmetric matrix with Var(I) = 1, Cov(I, S) = 0, and Var(S) = 4. 
For missingness, the bigger the latent slope was, the higher the missing data rate 
would be. The missingness probit coefficients were set as $\gamma_0 = (-1, -1, -1, -1)$, 
$\gamma_x = (-1.5, -1.5, -1.5, -1.5)$, and $\gamma_S = (0.5, 0.5, 0.5, 0.5)$. For example, if a 
participant had a latent growth slope of 3 and a covariate value of 1, then his or 
her missing probability at each wave was $\tau \approx 16\%$; if the slope was 5, with 
the same covariate value, the missing probability increased to $\tau = 50\%$; but 
if the slope was 1, then the missing probability decreased to $\tau \approx 2.3\%$. The 
covariate $x$ was also generated from a normal distribution, $x \sim N(1, sd = 0.2)$. 
In study 1, there were 16 conditions in total, with 4 missingness mechanisms (XS 
non-ignorable, XY non-ignorable, XI non-ignorable, and ignorable) combined 
with 4 levels of sample size (1000, 500, 300, and 200). Table 3 lists the model 
selection proportions across 100 replications for each of these indices across all 
conditions in study 1. The largest proportion across the 4 missingness models is 
indicated in the shaded cell for each index. When the sample size is relatively large 
(1000 or 500), all of the model selection indices, except for the rough DIC (RDIC), 
correctly identify the true model in 100% of the replications. When the sample size 
becomes smaller (300 or 200), all of the model selection indices except for the RDIC 
choose the true model in more than 93% of the replications. Comparing the indices 
defined based on Dbar with those defined based on Dhat, one can see that the former 
perform slightly better. 
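
The missing-data probabilities quoted for Study 1 follow directly from the LSD probit model with the stated coefficients. The short Python sketch below reproduces that arithmetic. The function names are hypothetical, and the covariate is fixed at 1 as in the worked example in the text.

```python
import math


def std_normal_cdf(v):
    """CDF of the standard normal distribution, Phi(v)."""
    return 0.5 * (1.0 + math.erf(v / math.sqrt(2.0)))


def lsd_missing_probability(slope, x, gamma_0=-1.0, gamma_x=-1.5, gamma_s=0.5):
    """tau = Phi(gamma_0 + S * gamma_S + x * gamma_x) with the Study 1 coefficients."""
    return std_normal_cdf(gamma_0 + slope * gamma_s + x * gamma_x)


# Latent slopes 1, 3, and 5 with covariate value x = 1
taus = {slope: lsd_missing_probability(slope, x=1.0) for slope in (1, 3, 5)}
for slope, tau in taus.items():
    print(slope, round(tau, 3))  # 1 -> 0.023, 3 -> 0.159, 5 -> 0.5
```

The linear predictors are -2, -1, and 0 for slopes 1, 3, and 5, which gives probabilities of about 2.3%, 16%, and 50%.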


Study 2 investigated the performance of these indices when data were 
non-mixture and homogeneous, with outliers and non-ignorable missingness. The 
main difference between studies 1 and 2 was that the data for study 2 contained 
outliers and therefore were not normally distributed, so robust growth curve 
models were used. The true model was TN-XS, in which the measurement errors 
($e_i$) at level 1 followed a t-distribution. Specifically, $e_i$ were generated from a 
t-distribution with 5 degrees of freedom and a scale matrix $I$, i.e., $e_i \sim Mt_T(0, I, 5)$. 
Other settings were kept the same as those in study 1. In this study, 32 
conditions in total were considered, with 4 data distributions (NN, TN, NT, and TT), 
4 missingness patterns (XS non-ignorable, XY non-ignorable, XI non-ignorable, 
and ignorable), and 2 levels of sample size (1000 and 500). Table 4 lists the model 
selection proportions. The largest proportion across the 16 candidate models is 
indicated in the shaded cells for each index. Except for the RDIC, all of the 
model selection indices correctly identify the true model. TT-XS is a competing 
model that also gains high selection probabilities. This is because a 
t-distribution with large degrees of freedom is almost identical to the normal 
distribution, and the degrees of freedom of the t-distribution are also estimated 
by the model. Also, the Dbar-based indices perform slightly better than the 
Dhat-based indices. Among them, Dbar-based BIC and CAIC perform best. 


Study 3 was designed for mixture data with outliers and non-ignorable 
missing data. Because the data came from a mixture, growth mixture models were 
used. In this study, the true model was a 2-class mixture TN-XS RGMM. Only the 
intercepts of the 2 classes were different, with 5 for class 1 and 1 for class 2. Other 
Table 3. Model Selection Proportion in Study 1 


                          N=1000                                   N=500 
              Non-ignorable              Ignorable   Non-ignorable              Ignorable 
Criterion^1   NN-XS^2  NN-XY^3  NN-XI^4  NN^5        NN-XS    NN-XY    NN-XI    NN 
Dbar.AIC      1        0.000    0.000    0.000       1        0.000    0.000    0.000 
Dbar.BIC      1        0.000    0.000    0.000       1        0.000    0.000    0.000 
Dbar.CAIC     1        0.000    0.000    0.000       1        0.000    0.000    0.000 
Dbar.ssBIC    1        0.000    0.000    0.000       1        0.000    0.000    0.000 
RDIC          0.013    0.000    0.987    0.000       0.038    0.000    0.962    0.000 
Dhat.AIC      1        0.000    0.000    0.000       1        0.000    0.000    0.000 
Dhat.BIC      1        0.000    0.000    0.000       1        0.000    0.000    0.000 
Dhat.CAIC     1        0.000    0.000    0.000       1        0.000    0.000    0.000 
Dhat.ssBIC    1        0.000    0.000    0.000       1        0.000    0.000    0.000 
DIC           1        0.000    0.000    0.000       1        0.000    0.000    0.000 

                          N=300                                    N=200 
Dbar.AIC      0.98125  0.01875  0.000    0.000       0.975    0.025    0.000    0.000 
Dbar.BIC      0.98125  0.01875  0.000    0.000       0.975    0.025    0.000    0.000 
Dbar.CAIC     0.98125  0.01875  0.000    0.000       0.975    0.025    0.000    0.000 
Dbar.ssBIC    0.98125  0.01875  0.000    0.000       0.975    0.025    0.000    0.000 
RDIC          0.1125   0.000    0.8875   0.000       0.2      0.03125  0.76875  0.000 
Dhat.AIC      0.95     0.05     0.000    0.000       0.9375   0.06875  0.000    0.000 
Dhat.BIC      0.95     0.05     0.000    0.000       0.9375   0.06875  0.000    0.000 
Dhat.CAIC     0.95     0.05     0.000    0.000       0.9375   0.06875  0.000    0.000 
Dhat.ssBIC    0.95     0.05     0.000    0.000       0.9375   0.06875  0.000    0.000 
DIC           1        0.000    0.000    0.000       0.98125  0.0125   0.00625  0.000 


Note. 
1. The definition of each index is given in Table 1. 
2. The shaded model is the true model. The model is normal-distribution-based 
with latent-slope-dependent missingness. 
3. The model is normal-distribution-based with potential-outcome-dependent 
missingness. 
4. The model is normal-distribution-based with latent-intercept-dependent 
missingness. 
5. The model is normal-distribution-based with ignorable missingness. 
6. The shaded cell has the largest proportion. 
Table 4. Model Selection Proportion in Study 2 


                            N=1000                          N=500 
                    Non-ignorable     Ignorable     Non-ignorable     Ignorable 
Index        Model  XS^5   XY     XI                XS     XY     XI 
Dbar.AIC     TN^1   0.519  0.000  0.000  0.000      0.597  0.013  0.000  0.000 
             TT^2   0.469  0.000  0.000  0.012      0.377  0.000  0.000  0.000 
             NT^3   0.000  0.000  0.000  0.000      0.006  0.000  0.000  0.000 
             NN^4   0.000  0.000  0.000  0.000      0.006  0.000  0.000  0.000 
Dbar.BIC     TN     0.781  0.000  0.000  0.000      0.855  0.013  0.000  0.000 
             TT     0.200  0.000  0.000  0.019      0.113  0.000  0.000  0.000 
             NT     0.000  0.000  0.000  0.000      0.006  0.000  0.000  0.000 
             NN     0.000  0.000  0.000  0.000      0.013  0.000  0.000  0.000 
Dbar.CAIC    TN     0.819  0.000  0.000  0.000      0.888  0.012  0.000  0.000 
             TT     0.162  0.000  0.000  0.019      0.075  0.000  0.000  0.000 
             NT     0.000  0.000  0.000  0.000      0.000  0.000  0.000  0.000 
             NN     0.000  0.000  0.000  0.000      0.019  0.000  0.000  0.000 
Dbar.ssBIC   TN     0.625  0.000  0.000  0.000      0.631  0.012  0.000  0.000 
             TT     0.362  0.000  0.000  0.012      0.338  0.000  0.000  0.000 
             NT     0.000  0.000  0.000  0.000      0.006  0.000  0.000  0.000 
             NN     0.000  0.000  0.000  0.000      0.006  0.000  0.000  0.000 
RDIC         TN     0.000  0.000  0.106  0.000      0.000  0.000  0.094  0.000 
             TT     0.000  0.000  0.100  0.000      0.000  0.000  0.113  0.000 
             NT     0.000  0.000  0.394  0.000      0.000  0.000  0.390  0.000 
             NN     0.000  0.000  0.400  0.000      0.000  0.000  0.403  0.000 
Dhat.AIC     TN     0.544  0.000  0.000  0.000      0.547  0.025  0.000  0.000 
             TT     0.506  0.006  0.000  0.000      0.447  0.019  0.000  0.000 
             NT     0.000  0.000  0.000  0.000      0.000  0.000  0.000  0.000 
             NN     0.000  0.000  0.000  0.000      0.000  0.000  0.000  0.000 
Dhat.BIC     TN     0.675  0.006  0.000  0.000      0.717  0.025  0.000  0.000 
             TT     0.319  0.000  0.000  0.000      0.245  0.013  0.000  0.000 
             NT     0.000  0.000  0.000  0.000      0.000  0.000  0.000  0.000 
             NN     0.000  0.000  0.000  0.000      0.000  0.000  0.000  0.000 
Dhat.CAIC    TN     0.700  0.006  0.000  0.000      0.788  0.025  0.000  0.000 
             TT     0.294  0.006  0.000  0.000      0.169  0.012  0.000  0.000 
             NT     0.000  0.000  0.000  0.000      0.000  0.000  0.000  0.000 
             NN     0.000  0.000  0.000  0.000      0.000  0.000  0.000  0.000 
Dhat.ssBIC   TN     0.575  0.006  0.000  0.000      0.588  0.025  0.000  0.000 
             TT     0.419  0.006  0.000  0.000      0.369  0.012  0.000  0.000 
             NT     0.000  0.000  0.000  0.000      0.000  0.000  0.000  0.000 
             NN     0.000  0.000  0.000  0.000      0.000  0.000  0.000  0.000 
DIC          TN     0.325  0.000  0.000  0.000      0.415  0.006  0.000  0.000 
             TT     0.462  0.000  0.000  0.194      0.409  0.000  0.000  0.000 
             NT     0.012  0.000  0.000  0.000      0.088  0.000  0.000  0.000 
             NN     0.006  0.000  0.000  0.000      0.082  0.000  0.000  0.000 


Note. 1-4. T-Normal, T-T, Normal-T, and Normal-Normal measurement errors and 
random effects, respectively. 5. Other abbreviations are as given in Table 3. 
settings for each class were the same as in study 2. Both classes have $t_5$ 
distributed measurement errors. Based on Anderson and Bahadur (1962), the 
class separation is around 2.7. In this study, we assumed traditional 
mixture models, i.e., class probabilities were fixed at (50%, 50%). 
As in study 2, there were 32 conditions considered, with 4 data 
distributions (NN, TN, NT, and TT), 4 missingness patterns (XS non-ignorable, 
XY non-ignorable, XI non-ignorable, and ignorable), and 2 levels of sample size 
(1000 and 1500). As mixture data require more data to obtain estimates, we 
increased the sample size. Table 5 shows the results for study 3. The shaded 
cell indicates the largest proportion across the 16 candidate models for each index. 
Again, almost all of the model selection indices correctly identify the true model, 
and the Dbar-based indices perform slightly better than the Dhat-based 
indices. Specifically, Dbar-based BIC and CAIC perform best among these 
indices, and Dbar-based ssBIC also performs well. 

Study 4 extended study 3 such that the class probabilities were not 
fixed. Instead, they depended on values of covariates. Also, the non-ignorable 
missingness in this study was allowed to depend on the corresponding 
observations' latent class membership. The true model in this study was a 2-class 
mixture TN-CXS robust extended growth mixture model (REGMM). The 
differences between this study and study 3 were that (1) the class proportions in this 
study were predicted by the value of covariate x; (2) the missing data rates were 
predicted by the latent class membership; and (3) both medium (2.7) and 
small (1.7) class separations were used. Specifically, for small class separation, 
the intercept for class 1 was 3.5 and the intercept for class 2 was 1. To simplify 
the simulation, based on the findings in study 3, 5 competing mixture models 
(TN-CXS, TT-CXS, TN-CX, NN-CXS, and NN-CX) were chosen to fit the data. 
In total, we considered 20 conditions with 5 mixture models, 2 levels of sample 
size (1500 and 1000), and 2 levels of class separation (2.7 and 1.7). Table 6 
shows the model selection proportions in study 4. Again, almost all of the model 
selection indices correctly identify the true model. Specifically, Dbar-based BIC 
and CAIC perform best among these indices. 

Study 5 focused on the number of classes. In this study, growth curve
models with different numbers of classes were fitted and compared. In total, 9
conditions were considered, crossing 3 models (TN-XS, TT-XS, and NN-XS) and
3 numbers of classes (1, 2, and 3). The true model was the 2-class mixture
TN-XS model. The simulation results for study 5 are presented in Table 7.
Among these indices, the Dhat-based indices perform better than the
Dbar-based indices. Specifically, Dhat-based BIC and CAIC perform best, and
ssBIC and AIC also provide high certainty.
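
The proportions reported in Tables 5 to 7 can be read as the share of
simulation replications in which an index prefers a given candidate model.
A minimal sketch of that tabulation follows, assuming that a smaller index
value indicates the preferred model. The function name is hypothetical and
the index values are made up for illustration; they are not simulation
output from this study.

```python
from collections import Counter


def selection_proportions(index_values_per_replication):
    """Share of replications in which each model has the smallest index value.

    index_values_per_replication: list of dicts {model_label: index_value},
    one dict per simulated data set.
    """
    winners = Counter(
        min(values, key=values.get) for values in index_values_per_replication
    )
    n_rep = len(index_values_per_replication)
    models = index_values_per_replication[0].keys()
    return {model: winners[model] / n_rep for model in models}


# Illustrative values only (hypothetical, not taken from the study)
replications = [
    {"TN-XS": 5010.2, "TT-XS": 5012.9, "NN-XS": 5040.1},
    {"TN-XS": 4987.5, "TT-XS": 4985.0, "NN-XS": 5021.3},
    {"TN-XS": 5003.8, "TT-XS": 5006.1, "NN-XS": 5035.7},
    {"TN-XS": 4995.4, "TT-XS": 4999.0, "NN-XS": 5030.2},
]
props = selection_proportions(replications)
```

Each row of a results table then corresponds to one index, and the entries
for that index sum to one across the candidate models, up to rounding.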
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Table 5. Model Selection Proportion in Study 3 


                N=1500                          N=1000
                Non-ignorable                   Non-ignorable
Index           XS     XY     XI     Ignorable  XS     XY     XI     Ignorable
Dbar.AIC    TN  0.621  0.000  0.000  0.000      0.593  0.000  0.000  0.000
            TT  0.357  0.000  0.000  0.000      0.314  0.000  0.000  0.000
            NT  0.000  0.000  0.000  0.000      0.021  0.000  0.000  0.000
            NN  0.021  0.000  0.000  0.000      0.071  0.000  0.000  0.000
Dbar.BIC    TN  0.864  0.000  0.000  0.000      0.843  0.000  0.000  0.000
            TT  0.114  0.000  0.000  0.000      0.064  0.000  0.000  0.000
            NT  0.000  0.000  0.000  0.000      0.014  0.000  0.000  0.000
            NN  0.021  0.007  0.000  0.000      0.079  0.000  0.000  0.000
Dbar.CAIC   TN  0.893  0.000  0.000  0.000      0.857  0.000  0.000  0.000
            TT  0.079  0.000  0.000  0.000      0.043  0.000  0.000  0.000
            NT  0.000  0.000  0.000  0.000      0.007  0.007  0.000  0.000
            NN  0.021  0.007  0.000  0.000      0.086  0.000  0.000  0.000
Dbar.ssBIC  TN  0.729  0.000  0.000  0.000      0.750  0.000  0.000  0.000
            TT  0.250  0.000  0.000  0.000      0.157  0.000  0.000  0.000
            NT  0.000  0.000  0.000  0.000      0.014  0.000  0.000  0.000
            NN  0.021  0.007  0.000  0.000      0.079  0.000  0.000  0.000
RDIC        TN  0.071  0.000  0.000  0.000      0.143  0.000  0.000  0.000
            TT  0.086  0.000  0.000  0.000      0.071  0.000  0.000  0.000
            NT  0.450  0.000  0.000  0.000      0.393  0.007  0.000  0.000
            NN  0.393  0.000  0.000  0.000      0.379  0.007  0.000  0.000
Dhat.AIC    TN  0.586  0.000  0.000  0.000      0.621  0.000  0.000  0.000
            TT  0.379  0.000  0.000  0.000      0.329  0.000  0.000  0.000
            NT  0.014  0.000  0.000  0.000      0.014  0.007  0.000  0.000
            NN  0.014  0.007  0.000  0.000      0.057  0.000  0.000  0.000
Dhat.BIC    TN  0.757  0.000  0.000  0.000      0.793  0.000  0.000  0.000
            TT  0.207  0.000  0.000  0.000      0.121  0.000  0.000  0.000
            NT  0.007  0.000  0.000  0.000      0.007  0.007  0.000  0.000
            NN  0.021  0.007  0.000  0.000      0.071  0.000  0.000  0.000
Dhat.CAIC   TN  0.757  0.000  0.000  0.000      0.814  0.000  0.000  0.000
            TT  0.207  0.000  0.000  0.000      0.100  0.000  0.000  0.000
            NT  0.007  0.000  0.000  0.000      0.007  0.007  0.000  0.000
            NN  0.021  0.007  0.000  0.000      0.071  0.000  0.000  0.000
Dhat.ssBIC  TN  0.586  0.000  0.000  0.000      0.664  0.000  0.000  0.000
            TT  0.379  0.000  0.000  0.000      0.250  0.000  0.000  0.000
            NT  0.014  0.000  0.000  0.000      0.014  0.007  0.000  0.000
            NN  0.014  0.007  0.000  0.000      0.064  0.000  0.000  0.000
DIC         TN  0.507  0.000  0.000  0.000      0.364  0.007  0.000  0.000
            TT  0.371  0.000  0.000  0.000      0.286  0.000  0.000  0.000
            NT  0.043  0.036  0.000  0.000      0.129  0.029  0.007  0.000
            NN  0.043  0.000  0.000  0.000      0.150  0.029  0.000  0.000


Note. Same as Table ВІ 
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Table 6. Model Selection Proportion in Study 4 


Index       TN-CXS  TT-CXS  NN-CXS  TN-CX   NN-CX

Class Separation=2.7, N=1500
Dbar.AIC    0.567   0.425   0.000   0.008   0.000
Dbar.BIC    0.808   0.158   0.000   0.033   0.000
Dbar.CAIC   0.850   0.108   0.000   0.042   0.000
Dbar.ssBIC  0.667   0.300   0.000   0.033   0.000
RDIC        0.042   0.042   0.908   0.000   0.008
Dhat.AIC    0.475   0.392   0.000   0.133   0.000
Dhat.BIC    0.550   0.233   0.000   0.217   0.000
Dhat.CAIC   0.525   0.233   0.000   0.242   0.000
Dhat.ssBIC  0.467   0.367   0.000   0.167   0.000
DIC         0.467   0.500   0.033   0.000   0.000

Class Separation=2.7, N=1000
Dbar.AIC    0.558   0.375   0.000   0.067   0.000
Dbar.BIC    0.750   0.125   0.000   0.125   0.000
Dbar.CAIC   0.767   0.100   0.008   0.125   0.000
Dbar.ssBIC  0.633   0.292   0.000   0.075   0.000
RDIC        0.092   0.075   0.808   0.000   0.025
Dhat.AIC    0.350   0.358   0.000   0.292   0.000
Dhat.BIC    0.450   0.175   0.000   0.375   0.000
Dhat.CAIC   0.442   0.150   0.000   0.400   0.008
Dhat.ssBIC  0.392   0.300   0.000   0.308   0.000
DIC         0.417   0.450   0.108   0.008   0.017

Class Separation=1.7, N=1500
Dbar.AIC    0.512   0.444   0.044   0.000   0.000
Dbar.BIC    0.744   0.212   0.044   0.000   0.000
Dbar.CAIC   0.781   0.175   0.044   0.000   0.000
Dbar.ssBIC  0.612   0.344   0.044   0.000   0.000
RDIC        0.306   0.238   0.350   0.006   0.100
Dhat.AIC    0.475   0.475   0.031   0.019   0.000
Dhat.BIC    0.712   0.238   0.031   0.019   0.000
Dhat.CAIC   0.712   0.238   0.031   0.019   0.000
Dhat.ssBIC  0.475   0.475   0.031   0.019   0.000
DIC         0.381   0.450   0.169   0.000   0.000

Class Separation=1.7, N=1000
Dbar.AIC    0.550   0.400   0.050   0.000   0.000
Dbar.BIC    0.719   0.194   0.081   0.006   0.000
Dbar.CAIC   0.750   0.162   0.081   0.006   0.000
Dbar.ssBIC  0.638   0.300   0.062   0.000   0.000
RDIC        0.244   0.256   0.362   0.000   0.138
Dhat.AIC    0.694   0.231   0.012   0.062   0.000
Dhat.BIC    0.644   0.294   0.012   0.050   0.000
Dhat.CAIC   0.694   0.231   0.012   0.062   0.000
Dhat.ssBIC  0.575   0.388   0.012   0.025   0.000
DIC         0.344   0.331   0.319   0.000   0.006

Note. Same as Table 3.




Table 7. Model Selection Proportion in Study 5

            2 CLASSES              1 CLASS                3 CLASSES
Index       TN-XS  TT-XS  NN-XS    TN-XS  TT-XS  NN-XS    TN-XS  TT-XS  NN-XS
Dbar.AIC    0.000  0.000  0.057    0.393  0.129  0.000    0.021  0.007  0.393
Dbar.BIC    0.000  0.000  0.036    0.821  0.064  0.000    0.000  0.000  0.079
Dbar.CAIC   0.000  0.000  0.036    0.864  0.043  0.000    0.000  0.000  0.057
Dbar.ssBIC  0.000  0.000  0.057    0.593  0.100  0.000    0.000  0.000  0.250
RDIC        0.036  0.014  0.200    0.014  0.014  0.679    0.014  0.014  0.014
Dhat.AIC    0.621  0.343  0.064    0.000  0.000  0.000    0.000  0.000  0.000
Dhat.BIC    0.793  0.136  0.071    0.000  0.000  0.000    0.000  0.000  0.000
Dhat.CAIC   0.814  0.114  0.071    0.000  0.000  0.000    0.000  0.000  0.000
Dhat.ssBIC  0.664  0.264  0.071    0.000  0.000  0.000    0.000  0.000  0.000
DIC         0.000  0.000  0.000    0.164  0.193  0.121    0.000  0.000  0.521
Note. Same as Table 


5 Application 


In this section, a real data set on mathematical growth is analyzed to
demonstrate the application of the indices. The same sample that has been
analyzed in (2011) is used here. It is a mathematical ability
growth sample from the NLSY97 survey (Bureau of Labor Statistics, U.S.
1997), which was collected from N = 1510 adolescents
yearly from 1997 to 2001, when each adolescent was administered the Peabody
Individual Achievement Test (PIAT) Mathematics Assessment to measure their
mathematical ability. There are some outliers at all five grades. That earlier
analysis applied a power transformation to normalize the sample and assumed
that the data were normally distributed without outliers. In this study,
however, we use the original non-transformed data with outliers and apply
robust methods. Also, different non-ignorable missingness mechanisms are
considered. Overall, the means of mathematical ability increased over time
with a roughly linear trend. The missing data rates range from 4.57% to
9.47%, and the raw data show that the missing pattern is intermittent. About
half of the sample is female.
The analysis is conducted following the steps in Table 8. In step 1, a
tentative model (the TT-ignorable model) is fitted to the data, with gender as
a covariate. The estimated degrees of freedom of the t distributions for the
two classes are 2.342 and 3.263 for measurement errors and 75.65 and 50.96 for
random effects, which indicates that measurement errors are better fitted by
a t distribution while random effects are approximately normally distributed
(i.e., a TN model). In step 2, to compare models with different
non-ignorable missingness mechanisms and numbers of classes, 10 models are
fitted to the data. During estimation we
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Table 8. Steps and Fitting Models in Real Data Analysis 


Step 1: Fit a tentative 2-class model, and check the estimated df of t

                  e        η        missingness
Model             N  T     N  T     C  X  I  S  Y
TT-ignorable         ✓        ✓

Step 2: Try models with different missingness and numbers of classes

2 Classes RGMMs
TN-X                 ✓     ✓           ✓
TN-XI                ✓     ✓           ✓  ✓
TN-XS                ✓     ✓           ✓     ✓
TN-XY                ✓     ✓           ✓        ✓

2 Classes REGMMs
TN-CX                ✓     ✓        ✓  ✓
TN-CXI               ✓     ✓        ✓  ✓  ✓
TN-CXS               ✓     ✓        ✓  ✓     ✓
TN-CXY               ✓     ✓        ✓  ✓        ✓

3 Classes GMMs
NN-X              ✓        ✓           ✓

4 Classes GMMs
NN-X              ✓        ✓           ✓

Step 3: Compare selection indices

Step 4: Interpret results obtained from the selected model


Note. Abbreviations are as given in Table DI 
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Table 10. Estimates of TN-CXY REGMM in Real Data Analysis 


Parameter                       Mean    S.D.  MC.e./S.D.^1  Lower^2  Upper^3  Geweke z^4
Class 1
  Intercept                    8.647   0.037         0.026    8.572    8.717       0.007
  Slope                        0.229   0.009         0.023    0.211    0.247       0.014
  Var(I)                       0.234   0.028         0.024    0.183    0.293      -0.009
  Var(S)                       0.014   0.002         0.018    0.011    0.017       0.004
  Cov(I,S)                    -0.036   0.006         0.022   -0.049   -0.026      -0.005
  Var(e)                       0.044   0.004         0.031    0.037    0.053       0.024
  df^5                         2.386   0.205         0.043    2.118    2.900       0.050
Class 2
  Intercept                    6.196   0.047         0.020    6.103    6.287       0.054
  Slope                        0.315   0.011         0.022    0.295    0.336       0.036
  Var(I)                       1.326   0.084         0.017    1.167    1.497       0.020
  Var(S)                       0.034   0.004         0.022    0.027    0.042       0.010
  Cov(I,S)                     0.010   0.014         0.021   -0.018    0.037      -0.023
  Var(e)                       0.372   0.020         0.033    0.336    0.412      -0.061
  df                           3.200   0.195         0.040    2.850    3.600      -0.042
Class probability^6
  Coefficient (1)             -0.214   0.119         0.051   -0.438    0.018      -0.039
  Coefficient (2)             -0.223   0.077         0.051   -0.372   -0.076       0.026
Missingness, Grade 7
  Class membership 1^7        -0.711   0.532         0.066   -1.843    0.204      -0.255
  Class membership 2^8        -0.132   0.216         0.058   -0.527    0.310       0.231
  Covariate^9                 -0.154   0.108         0.046   -0.368    0.058       0.008
  Potential outcome Y^10      -0.087   0.059         0.065   -0.190    0.038       0.251
Missingness, Grade 8
  Class membership 1          -1.157   0.446         0.064   -2.097   -0.447      -0.373
  Class membership 2           0.046   0.217         0.055   -0.345    0.489       0.347
  Covariate                    0.113   0.114         0.046   -0.109    0.334       0.032
  Potential outcome Y         -0.108   0.045         0.062   -0.188   -0.021       0.330
Missingness, Grade 9
  Class membership 1          -0.613   0.454         0.065   -1.519    0.163      -0.462
  Class membership 2          -0.057   0.181         0.056   -0.403    0.292       0.381
  Covariate                   -0.147   0.094         0.046   -0.332    0.038       0.045
  Potential outcome Y         -0.074   0.045         0.064   -0.155    0.022       0.459
Missingness, Grade 10
  Class membership 1          -0.032   0.512         0.066   -0.861    0.985      -0.426
  Class membership 2          -0.324   0.204         0.059   -0.732    0.029       0.362
  Covariate                    0.059   0.101         0.047   -0.142    0.251       0.128
  Potential outcome Y         -0.166   0.050         0.065   -0.266   -0.084       0.378
Missingness, Grade 11
  Class membership 1          -1.298   0.421         0.065   -2.130   -0.442      -0.192
  Class membership 2           0.341   0.176         0.055    0.015    0.708       0.159
  Covariate                   -0.087   0.091         0.045   -0.263    0.083       0.001
  Potential outcome Y         -0.019   0.040         0.064   -0.092    0.062       0.189

1 Ratio of MC error to standard deviation. A value around or less than 0.05
indicates that the corresponding estimate is accurate (Spiegelhalter, Thomas,
Best, & Lunn, 2003).
2-3 The lower 2.5 percentile and upper 97.5 percentile.
4 Geweke test z value. An absolute value less than 1.96 indicates that the
corresponding chain has passed the convergence test.
5 The degrees of freedom of the multivariate-t.
6 The probit coefficient of the class probability for class 1.
7 The probit coefficient of the class membership 1 at Grade 7, defined in Eqn. (9).
8 The probit coefficient of the class membership 2 at Grade 7, defined in Eqn. (9).
9 The probit coefficient of the covariate at Grade 7, defined in Eqn. (9).
10 The probit coefficient of the potential outcome Y at Grade 7, defined in Eqn. (9).
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use uninformative priors which carry little information for model parameters. 
A burn-in period is run first to ensure that estimates are based on Markov
chains that have converged. To test convergence, the history plot is examined
and Geweke's z statistic is checked for each parameter. The absolute values
of Geweke's z statistics for all the parameters are smaller than 1.96, which
indicates that the Markov chains have converged. To make sure all the
parameters are estimated accurately, the next 50,000 iterations are then saved
for data analysis. The ratio of Monte Carlo error (MCerror) to standard
deviation (S.D.) for each parameter is smaller than or close to 0.05, which
indicates that the parameter estimates are accurate (Spiegelhalter, Thomas,
Best, & Lunn, 2003). In step 3, model selection indices are used to compare
the ten models. The indices are listed in Table 9. In step 4, the results
obtained from the final selected model are interpreted.
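
The two checks described above can be sketched as follows. This is a
simplified illustration, not the software output used in the analysis: the
Geweke z here compares the mean of the first 10% of a chain with the mean of
the last 50% using plain sample variances, whereas the usual diagnostic
replaces those variances with spectral density estimates that account for
autocorrelation, and the Monte Carlo error is approximated by batch means.
The function names, the segment fractions, the number of batches, and the
simulated chain are all assumptions made for the example.

```python
import math
import random
import statistics


def geweke_z(chain, first=0.1, last=0.5):
    """Simplified Geweke z: early-segment mean versus late-segment mean.

    Plain sample variances are used, so autocorrelation is ignored.
    """
    n = len(chain)
    early = chain[: int(first * n)]
    late = chain[n - int(last * n):]
    se = math.sqrt(
        statistics.variance(early) / len(early)
        + statistics.variance(late) / len(late)
    )
    return (statistics.fmean(early) - statistics.fmean(late)) / se


def mc_error_ratio(chain, n_batches=50):
    """Batch-means Monte Carlo error divided by the posterior S.D."""
    size = len(chain) // n_batches
    used = chain[: size * n_batches]
    batch_means = [
        statistics.fmean(used[i * size:(i + 1) * size]) for i in range(n_batches)
    ]
    mc_error = statistics.stdev(batch_means) / math.sqrt(n_batches)
    return mc_error / statistics.stdev(used)


# Simulated stand-in for 50,000 saved iterations of one parameter
random.seed(1)
draws = [random.gauss(0.23, 0.01) for _ in range(50_000)]
z = geweke_z(draws)
ratio = mc_error_ratio(draws)
converged = abs(z) < 1.96      # criterion stated in the text
accurate = ratio <= 0.05       # criterion stated in the text
```

Because the simulated draws are independent, the ratio is far below 0.05;
real MCMC output is autocorrelated, so its ratio is typically larger for the
same number of iterations.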

As suggested by Dhat.CAIC, Dhat.ssBIC, Dhat.BIC, and Dhat.AIC, and without
further substantive information, the TN-CXY model appears to be a good
candidate for the best-fitting model. Table 10 provides the results of the
TN-CXY REGMM. It can be seen that (1) class 1 has a higher average initial
level but a smaller average slope; (2) class 2 has larger variations in initial
level and slope; (3) the residual variance of class 2 is much larger than that
of class 1; (4) in class 1 the initial level and the slope are significantly
negatively correlated at the 95% confidence level; (5) the missingness is not
related to gender because none of the coefficients of gender is significant
at the α level of 0.05; (6) at grade 11, adolescents in class 2 are more
likely to miss tests than those in class 1 because the probit coefficient of
class membership for grade 11 is significantly positive; and (7) at grades 8
and 10, students with higher potential scores are more likely to miss tests
than the students having lower scores because the probit coefficients of the
potential outcomes y at the two grades are significantly negative.
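
The significance statements above can be checked against the lower 2.5 and
upper 97.5 percentiles reported in Table 10. The sketch below applies one
simple rule, that an estimate is flagged when its interval excludes zero, to
a few rows of that table. Reading "significant" this way is an assumption
about the criterion used, and the row labels are descriptive shorthand
introduced for the example rather than the parameter symbols of the model.

```python
# (lower 2.5 percentile, upper 97.5 percentile) copied from Table 10
intervals = {
    "Class 1 Cov(I,S)": (-0.049, -0.026),
    "Grade 11, class membership 2": (0.015, 0.708),
    "Grade 8, potential outcome y": (-0.188, -0.021),
    "Grade 10, potential outcome y": (-0.266, -0.084),
    "Grade 7, covariate (gender)": (-0.368, 0.058),
    "Grade 9, potential outcome y": (-0.155, 0.022),
}


def excludes_zero(lower, upper):
    """True when the interval lies entirely above or entirely below zero."""
    return lower > 0 or upper < 0


flags = {
    name: excludes_zero(lower, upper)
    for name, (lower, upper) in intervals.items()
}
for name, flagged in flags.items():
    print(f"{name}: {'excludes zero' if flagged else 'includes zero'}")
```

Under this rule the first four rows are flagged and the last two are not,
which agrees with points (4) through (7) above.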


6 Conclusions, Discussion and Future Research 


Based on the results from the five simulation studies, one can conclude that (1)
almost all of the model selection indices, except for the rough DIC (RDIC),
can correctly choose the true model with high certainty; (2) if the number
of classes is correctly identified, then the Dbar-based indices perform better
than the Dhat-based indices, whereas if candidate models have different
numbers of classes, then the Dhat-based indices might be used to select the
best-fitting model; and (3) across the five studies, CAIC and BIC select the
true model with higher probability than ssBIC, AIC, or DIC. The results will
help inform the selection of growth models by researchers seeking to provide
accurate estimates of growth across a variety of possible contexts. The real
data analysis demonstrated the application of the indices to typical
longitudinal growth studies, such as those in educational, psychological, and
social research.

This study can be extended in many ways. For example, different versions
of the likelihood function or more model selection indices can be studied and
compared by using more practical statistical models. (1) As stated in
the Introduction, there are at least three challenges in proposing
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new selection indices. The third challenge is about the likelihood function 
l(y|θ). When latent variables are involved, the likelihood can be an
observed-data likelihood, a complete-data likelihood, or a conditional
likelihood (Celeux, Forbes, Robert, & Titterington, 2006). In this study, we
use a conditional joint loglikelihood, but in the future the other versions of
the likelihood function can be investigated. (2) Another direction for future
research is to propose other model selection indices, such as Bayes factors.
(3) This study focuses on latent growth models only. In the future, the
performance of these selection indices can be studied with other statistical
models, such as survival models.
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